Abstract

The statistical properties of a stochastic process may be described 
(1) by the expectation values of the observables, (2) by the probability
distribution functions or (3) by probability measures on path space.
Here an analysis of level (3) is carried out for market fluctuation pro- 
cesses. Gibbs measures and chains with complete connections are 
considered. Some other topics are also discussed, in particular the 
asymptotic stationarity of the processes and the behavior of statis- 
tical indicators of level (1) and (2). We end up with some remarks 
concerning the nature of the market fluctuation process. 

Keywords: Market fluctuations, Gibbs measures, Chains with complete
connections



1 Introduction 

When a physical phenomenon is measured with a set of instruments, what 
we register is a sequence of values of some variable X 
which takes values in a space Y. We will call Y the state space and the space
of sequences Y^ the path space. Statistical properties of the phenomenon 
may be described at three different levels: 

(1) By the expectation values of the observables; 

(2) By the probability measures on the state space Y; 

(3) By the probability measures on path space Y^ . 

One obtains three different characterizations of the phenomenon which 
represent successively finer levels of description of the statistical properties. 
Borrowing a terminology used in large deviation theory, we will call
these three types of description, respectively, level-1, level-2 and level-3
statistical indicators.

To obtain expectation values and probability measures we would require 
infinite samples and a law of large numbers. For any finite sample we obtain 
finite versions of the expectation values, the probability on state space and 
the probability on path space which are called the mean partial sums, the 
empirical measures (or empirical probability distribution functions - pdf's) 
and the measures on the empirical process. 

Level-1 and level-2 analyses are the most common ones, and their statistical
indicators are the most commonly quoted when a stochastic process is analyzed.
However, different processes may be associated with the same expectation
values for the observables or with the same pdf's. Therefore full understanding
of the process requires the determination of the level-3 indicators. Recent
advances have been obtained on the identification of processes, especially in
connection with the analysis of hydrodynamic turbulence data.
In particular it has been clarified that analysis and reconstruction of
the process involves two different but related steps. One is the identification
of the grammar of the process, that is, the allowed transitions in the state
space or the subspace in path space that corresponds to actual orbits of the
system. The second step is the identification of the measure, which concerns
the occurrence frequency of each orbit in typical samples. Although largely
independent from each other, these two features have a related effect on the
constraints they impose on the statistical indicators.

Identification of grammars and measures (in particular Gibbs measures)
has been dealt with recently, in particular in the context of hydrodynamic
turbulence and other dynamical systems. Market fluctuations are an
interesting stochastic process. Some analogies have been found between
this process and some of the features of turbulence data. However,
when statistical indicators are computed, it turns out that the two processes
are different [13] [14] [15]. Nevertheless the statistical tools that have been
developed for turbulence are mathematical devices which are not
process-dependent and they may be applied to any stochastic process. Of
course, underlying this approach is the working hypothesis that statistical
methods, by themselves, are an appropriate tool to describe and reconstruct
the market fluctuation process. This hypothesis underlies the modern view
of the efficient market, namely the idea that the apparent overreaction of the
market in some circumstances and underreaction in others is pure chance. In other
words, the expected value of abnormal returns is zero. Contrariwise, if a
well defined deterministic pattern of over- and underreaction is ever found
then, in addition to chance, a behavioral component [18] must always be
included in any description of the market. Behavioral trends, however, may
turn out not to be inconsistent with a purely statistical description if the different
reaction times of the diverse market components are taken into account, as
well as the secondary reactions of the components to each other's moves.

The emphasis of this paper will be on level-3 analysis and on the
reconstruction of the processes. Nevertheless we have also dedicated some time to
the computation, for market fluctuations, of the level-1 and level-2 statistical
indicators used in the past for turbulence data. In particular the behavior
of some of these indicators already provides information on the nature of the
grammars. This analysis is carried out in Sect. 3. Sect. 4 is dedicated to
the search for a Gibbs measure and, once the long-memory features of the
market processes are exhibited, Sect. 5 attempts to describe the processes in
the framework of chains with complete connections.

However, the first step in the analysis of any stochastic process is to 
inquire about the stationarity of the process and whether typical samples 
are available. This is the subject of the next section. 

2 Is there an asymptotically stationary market fluctuation phenomenon?

Large samples of high-frequency finance data are now available. However,
high-frequency data may not be the most appropriate data to begin
understanding the stochastic process that underlies the market mechanism. This
is because, when comparing minute to monthly variations for example, one
is comparing systems with very different compositions, trading agents
operating on the minute scale being in general different from those operating on
longer time scales. This is evidenced, for example, by the different scaling
laws for low and high-frequency data. In market data one faces a complexity
versus statistics trade-off. High-frequency data certainly provides better
statistics but it also involves the interplay of many more reaction time scales
and market compositions in the trading process. For this reason, to "purify"
our samples as much as possible, we have decided to concentrate on daily
data. The price to be paid for this choice is the fact that, as compared for
example with a large scale hydrodynamics experiment, the available amount
of one-day market fluctuation data is relatively small. If, in addition, the
data is non-stationary, the chances of obtaining a reliable statistical analysis
would be rather slim.

Reliable application of statistical mechanics tools to any kind of signal
presupposes that two conditions are fulfilled. First, that the process that
generates the data has some kind of underlying stationarity or asymptotic
stationarity. Second, that the time sequence that is presented to the analysis
is a typical sample of the process. As for the second condition, we can
only hope that it is realized; to improve our belief in it, several
different signals of a similar nature should be analyzed (several different
stocks, or currencies or markets). The first condition requires some
preprocessing of the data. We will concentrate in this paper on the daily
fluctuation data of industrial stocks and indexes, and the objective is to try
to extract the features of the market process that acts on them. We look
at each stock as an experimental probe that, while reacting to the market
pressures, may reveal some of the mechanisms of the market process.

Market prices are by nature non-stationary entities. They fluctuate, they 
have general trends that depend on the general state of the economy, on the 
total amount of capital flowing to the market, on the general acceleration of 
the economy, on long and medium term political decisions and expectations,
etc. Nevertheless, our hypothesis is that, if all these global factors are ex- 
tracted from the data, there are still some invariant features that characterize 
this peculiar human phenomenon. 





Figure 1: Daily price fluctuation data 



The type of data that will be analyzed is displayed in Fig. 1, which shows
daily price data $p(t)$ for three stocks and the NYSE composite index. Its
non-stationary nature is very apparent. The first step is to extract the general
trend. This is done, in a smooth way, by a polynomial fit $q(t)$ (Fig. 2 shows an
example, where a degree-7 polynomial is used). Fig. 3 shows the difference
$p(t) - q(t)$. Clearly the data is still very far from stationary because, due
to the market volume acceleration, recent fluctuations carry a much larger
weight. Therefore the last step is a rescaling of the data by the average
$\langle p(t) \rangle$, that is, the quantities

$$ x(t) = (p(t) - q(t)) \frac{\langle p(t) \rangle}{q(t)} \tag{1} $$

are the signals to be analyzed. They are shown in Fig. 4. To anyone used to
examining turbulence data, it looks as if the market signals are now somewhat
stable. That does not mean, of course, that they are stationary in the strict
sense. However it suggests that in spite of currency adjustments, increased
number of players, trade volumes and other macroeconomic indicators, there
is something more or less permanent in this human game.







Figure 2: Detrending by a polynomial 

Figure 3: Detrended data

Detrending and rescaling of the data is important because we will be
analyzing price differences over large time intervals. For one-day differences of
log-price, the results would be identical to those obtained from the raw data.
After detrending and rescaling, the overall amplitude of the price
fluctuations becomes reasonably uniform over the time span of the data. However
the process is not (locally) stationary, as seen in Figs. 5 and 6, which show
the strong variation in time of the volatility (here defined as the standard
deviation of the price fluctuations). The two figures on the left show the
standard deviation computed on a sliding time window of 10 days. On the right
one compares the cumulative standard deviation for the rescaled (full line)
and the non-rescaled data (dashed line). It is quite apparent that only the
rescaled data has a chance of belonging to an asymptotically stationary
process. Once the data is detrended and rescaled there is in fact no evidence
for an abnormal increase, in recent times, of the volatility in the underlying
process.
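
As a rough illustration of the two volatility curves just described, the following Python sketch computes the standard deviation on a sliding 10-day window and the cumulative standard deviation. It is only a minimal sketch: the function names and the synthetic test signal are assumptions introduced here for illustration and are not part of the original analysis.

```python
import math
import random


def std(values):
    """Population standard deviation of a non-empty list of numbers."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def sliding_volatility(x, window=10):
    """Standard deviation of x over every run of `window` consecutive points."""
    return [std(x[i:i + window]) for i in range(len(x) - window + 1)]


def cumulative_volatility(x):
    """Standard deviation of x[0..t] for t = 1, ..., len(x) - 1."""
    return [std(x[:t + 1]) for t in range(1, len(x))]


# Hypothetical signal (an assumption for illustration): noise whose amplitude
# grows in time, mimicking data that has not been rescaled.
rng = random.Random(0)
signal = [(1.0 + t / 200.0) * rng.gauss(0.0, 1.0) for t in range(400)]

local = sliding_volatility(signal, window=10)
cumulative = cumulative_volatility(signal)
print(len(local), len(cumulative))
```

For a signal with growing amplitude the cumulative curve keeps drifting upwards, which is the behavior attributed above to the non-rescaled data.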

A direct test of stationarity of the detrended and rescaled data was
obtained by coding with a 5-symbol alphabet (as explained in Sect. 4). When
the entropies of multi-symbol words are computed separately in the first and
the second half of the samples, no significant difference is found.



3 Statistical indicators for typical samples 

Here we concentrate on level-1 and level-2 analysis of the regularized samples
discussed in Sect. 2, that is, we compute quantities related to average values
and to probability distribution functions (pdf's). The level-3 analysis of the
processes will be done in the later sections.

Figure 4: Detrended and rescaled data

Figure 5: Ten-days window volatility [...] for the rescaled and non-rescaled data (panel title: Volatility-10 days (IBM))

The main variables that are used to construct the statistical indicators
are the differences of log-prices

$$ r(t, n) = \log p(t + n) - \log p(t) \tag{2} $$

sometimes called the $n$-days return. For each experimental sample, three
main statistical indicators are computed:

(i) The maximum (over $t$) of $r(t, n)$

$$ \delta(n) = \max_t \{ r(t, n) \} \tag{3} $$

(ii) The moments of the distribution of $|r(t, n)|$

$$ S_q(n) = \langle |r(t, n)|^q \rangle \tag{4} $$

with $\langle \; \rangle$ meaning the sample average

(iii) If inside a certain range, the moments satisfy

$$ S_q(n) \sim n^{\chi(q)} \tag{5} $$

then the scaling exponent $\chi(q)$ is another important statistical indicator.
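
The three indicators can be computed directly from a series of log-prices. The sketch below is a minimal Python illustration, assuming the series is held in a plain list; the function names, the least-squares fit used for the exponent and the random-walk test series are assumptions made here, not the authors' code.

```python
import math
import random


def returns(log_p, n):
    """n-day returns r(t, n) = log p(t + n) - log p(t), Eq. (2)."""
    return [log_p[t + n] - log_p[t] for t in range(len(log_p) - n)]


def delta(log_p, n):
    """Maximum over t of r(t, n), Eq. (3)."""
    return max(returns(log_p, n))


def moment(log_p, n, q):
    """Sample average of |r(t, n)|**q, Eq. (4)."""
    r = returns(log_p, n)
    return sum(abs(v) ** q for v in r) / len(r)


def scaling_exponent(log_p, q, n_values):
    """Least-squares slope of log S_q(n) against log n, Eq. (5)."""
    xs = [math.log(n) for n in n_values]
    ys = [math.log(moment(log_p, n, q)) for n in n_values]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Hypothetical test series (an assumption): a Gaussian random walk in
# log-price, for which the exponent for q = 1 is expected to be near 0.5.
rng = random.Random(1)
walk = [0.0]
for _ in range(5000):
    walk.append(walk[-1] + rng.gauss(0.0, 0.01))

chi_1 = scaling_exponent(walk, 1, range(2, 61))
print("delta(10) =", delta(walk, 10), " chi(1) =", chi_1)
```

The fit range `range(2, 61)` mirrors the scaling region n = 2 to n = 60 quoted in the text; any other range is a free choice of the user.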

The results obtained from our detrended and rescaled samples are
displayed in Figs. 7 to 9. Fig. 7 refers to $\delta(n)$ and Fig. 8 shows $S_q(n)$ as a function
of $n$ for different values of $q$ (from top to bottom $q = 1$ to $q = 8$). The large
fluctuations in $\delta(n)$ for large values of $n$ and in $S_q(n)$ for large $q$ are quite
natural given the size of the data samples.

In the range $n = 2$ to $n = 60$ the moments follow an approximate power
law of the type of Eq. (5) and from the behavior in this region we have
extracted the scaling exponent $\chi(q)$ shown in Fig. 9. The main conclusions from
this analysis of the statistical indicators are:

(a) $\delta(n)$ is log-concave, that is, $\log \delta(n)$ is concave as a function of $\log n$,
increasing and probably (with better statistics) asymptotically constant for
large $n$;

(b) $S_q(n)$ is also an increasing log-concave function of $n$, allowing a power
law approximation in a limited range;

(c) The scaling law $\chi(q)$ is an increasing concave function of $q$;

(d) For all samples, $\chi(1)$ computed in the scaling region ($n = 2$ to $n = 60$)
is very close to 0.5;

(e) The scaling properties of the NYSE index seem somewhat different
from those of the other stocks. However this is only apparent for $q > 5$,
where poor statistics effects may already be felt.

Figure 6: Same as in Fig. 5 for Bayer and NYSE

Figure 7: Maximum $\delta(n)$ of log-price differences

From this analysis one also obtains precise statements concerning the
similarities and differences between hydrodynamic turbulence and the market
fluctuation process. Properties (a) to (c) are shared by the turbulence data,
although the numerical values of the statistical indicators are quite different.
For example, for turbulence data $\chi(1) = 1/3$ whereas here $\chi(1) \simeq 0.5$, showing
the essentially uncorrelated nature of the signal for $n > 2$. The correlation
functions of the one-day returns and of their absolute values

$$ C(r(1), T) = \langle r(t + T, 1)\, r(t, 1) \rangle \tag{6} $$

and

$$ C(|r(1)|, T) = \langle |r(t + T, 1)|\, |r(t, 1)| \rangle \tag{7} $$

are shown in Fig. 10. One sees that for $T > 2$ the returns are uncorrelated,
their correlation function remaining at the noise level. In contrast the
correlation for the absolute value remains non-negligible for a longer time (at
least up to $T = 10$). This means that although the returns are linearly
uncorrelated, non-linear functions of the returns remain correlated for longer
periods.
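
The two correlation functions of Eqs. (6) and (7) are plain sample averages of lagged products and can be sketched in a few lines of Python. The function names and the optional `center` flag are assumptions added here: the equations as written do not subtract the mean, so the flag defaults to off, and switching it on is a common variant rather than something stated in the text.

```python
def correlation(r, lag, center=False):
    """Sample average of r(t + lag) * r(t), as in Eqs. (6) and (7).

    With center=True the sample mean is removed first (an assumption, not
    part of the equations as written).
    """
    if center:
        mean = sum(r) / len(r)
        r = [v - mean for v in r]
    pairs = len(r) - lag
    return sum(r[t + lag] * r[t] for t in range(pairs)) / pairs


def linear_and_absolute_correlation(r, max_lag, center=False):
    """List of (lag, C(r, lag), C(|r|, lag)) for lag = 1, ..., max_lag."""
    abs_r = [abs(v) for v in r]
    return [
        (lag, correlation(r, lag, center), correlation(abs_r, lag, center))
        for lag in range(1, max_lag + 1)
    ]


# Hypothetical one-day returns (an assumption for illustration only).
sample = [0.01, -0.02, 0.015, -0.005, 0.03, -0.025, 0.002, -0.001]
for lag, c_lin, c_abs in linear_and_absolute_correlation(sample, 3):
    print(lag, c_lin, c_abs)
```

Without centering, the correlation of absolute values tends to the square of the mean absolute return at large lags instead of zero, which should be kept in mind when comparing the two curves.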

The behavior of the statistical indicators $\delta(n)$, $S_q(n)$ and $\chi(q)$ already
has some strong implications on the level-3 features of the process, namely
on the structure of its grammar. In fact, without restrictions on the allowed
transitions $\delta(n)$ and $S_q(n)$ would be independent of $n$ and $\chi(q) = 0$ for all $q$.
In particular, property (a) implies that if the process is a topological Markov
chain the transitions allowed by the transition matrix $T$ must lie inside a
strictly convex domain around the diagonal of $T$.

Figure 8: Moments of the $|r(t, n)|$ distribution

Fig. 11 illustrates the dynamics of one-day returns

$$ r(t, 1) \to r(t + 1, 1) \tag{8} $$

It shows that the bulk of the data consists of a central core of small
fluctuations with a few large flights away from this core. This structure of the data
will have a strong influence on the results obtained in the next section.

4 Looking for a Gibbs measure 

Let us assume a coding of the dynamical system by a finite alphabet $\Sigma$. Then
the space $\Omega$ of orbits of the system consists of infinite sequences
$\omega = i_1 i_2 \cdots i_k \cdots$, $i_k \in \Sigma$, with the dynamical law being a shift
$\sigma$ on these symbol sequences

$$ \sigma \omega = i_2 \cdots i_k \cdots \tag{9} $$

Depending on the dynamical law of the coded system, not all sequences
will be allowed. The set of allowed sequences in $\Omega$ defines the grammar of
the shift. The set of all sequences which coincide on the first $n$ symbols is
called an $n$-cylinder (or $n$-block) and is denoted $[i_1 i_2 \cdots i_n]$. The probability
measure over the cylinders is the main tool that is used to characterize the
dynamical properties and is a piece of information that may be inferred from
the data.

Figure 9: Scaling exponent $\chi(q)$



A particularly important measure on the cylinders is the Gibbs measure
defined by

$$ C_1 \leq \frac{\mu([i_1(\omega) i_2(\omega) \cdots i_n(\omega)])}{\exp(-nP + (S_n \phi)(\omega))} \leq C_2 \tag{10} $$

with $C_1$ and $C_2$ positive constants and $(S_n \phi)(\omega) = \sum_{k=0}^{n-1} \phi(\sigma^k \omega)$,
$\phi$ being a Hölder continuous function on $\Omega$
called the potential and $P(\phi, G)$ a function depending on the potential and
the grammar called the pressure of $\phi$.

Figure 10: Correlation functions of $r(t, 1)$ and $|r(t, 1)|$

The (equilibrium) Gibbs measure and the pressure bear an important
relationship to the entropy

$$ h(\mu) = \lim_{n \to \infty} \frac{H_n}{n} = \lim_{n \to \infty} -\frac{1}{n} \sum_{i_1 \cdots i_n} \mu([i_1 i_2 \cdots i_n]) \log \mu([i_1 i_2 \cdots i_n]) \tag{11} $$

This is the variational principle that states that, for each potential and
grammar, the $\sup_\eta \{ h(\eta) + \int \phi \, d\eta \}$ taken over all $\sigma$-invariant measures $\eta$
is reached only for the Gibbs measure $\mu$ and equals the pressure

$$ P(\phi, G) = h(\mu) + \int \phi \, d\mu \tag{12} $$

The potential may be chosen in such a way that $P = 0$. Such a potential
is called a normalized potential. In this case we have the following result

$$ \phi(\omega) = \lim_{n \to \infty} \log \frac{\mu([i_1(\omega) \cdots i_n(\omega)])}{\mu([i_2(\omega) \cdots i_n(\omega)])} \tag{13} $$

In principle this formula may be used to construct the potential using the
empirical measures $\bar{\mu}([i_1(\omega) \cdots i_n(\omega)])$ obtained from the experimental
sample. The problem is that Eq. (13) requires the use of blocks of length $n$ as
large as possible but, for a finite sample, the statistics of such blocks suffers
from large uncertainties.

Figure 11: The dynamics of one-day returns

For practical purposes the most important class of Gibbs measures is
the one associated to finite range potentials, that is, functions on $\Omega$ that
depend only on the first $r$ symbols of a sequence $\omega \in \Omega$. The importance
of finite range potentials lies in the fact that they may be used to uniformly
approximate any Hölder continuous potential and, on the other hand, given
a limited amount of experimental data, only finite-range potentials may be
reliably inferred from experiment.

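An important property of range-$r$ potentials is that for all values $i_1 i_2 \cdots i_n$
with $n > r$

$$ \frac{\mu([i_1 i_2 \cdots i_n])}{\mu([i_2 \cdots i_n])} = \frac{\mu([i_1 i_2 \cdots i_r])}{\mu([i_2 \cdots i_r])} \tag{14} $$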

We will make use of this important relation in our attempt to look for a Gibbs
measure for the market fluctuation data. On the one hand the relation (14)
allows one to express the entropy in terms of measures of cylinders of finite length
only, namely

$$ h(\mu) = -\sum_{i_1 \cdots i_k} \mu([i_1 \cdots i_k]) \log \frac{\mu([i_1 \cdots i_k])}{\mu([i_2 \cdots i_k])} = H_k - H_{k-1} \tag{15} $$

for all $k > r$ if $r > 1$. If $r = 1$, $h(\mu) = H_1$. $H_k$ is the entropy associated to
cylinders of length $k$

$$ H_k = -\sum_{i_1 \cdots i_k} \mu([i_1 \cdots i_k]) \log \mu([i_1 \cdots i_k]) \tag{16} $$

This provides a criterion to find the range of the potential. Using the
empirical cylinder probabilities one computes $H_k$ for successively larger $k$. Then,
the range of the potential is found when $H_k - H_{k-1}$ tends to a constant value.
Once the range is found, the potential may be constructed directly from the
empirical weights $\bar{\mu}([i_1 \cdots i_k])$.
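
The range criterion can be tried on any coded sequence with a few lines of Python. The sketch below is only illustrative and rests on assumptions: the sequence is a plain list of symbols, empirical cylinder probabilities are relative frequencies of overlapping blocks, and the function names are invented here.

```python
import math
from collections import Counter


def block_probabilities(symbols, k):
    """Empirical probabilities of the k-blocks (k-cylinders) of a sequence."""
    if k == 0:
        return {(): 1.0}
    total = len(symbols) - k + 1
    counts = Counter(tuple(symbols[i:i + k]) for i in range(total))
    return {block: c / total for block, c in counts.items()}


def block_entropy(symbols, k):
    """H_k = -sum mu log mu over the observed k-blocks, Eq. (16)."""
    probs = block_probabilities(symbols, k).values()
    return -sum(p * math.log(p) for p in probs)


def entropy_increments(symbols, k_max):
    """List of (k, H_k - H_{k-1}, number of distinct k-blocks), k = 1..k_max."""
    rows = []
    previous = 0.0  # H_0 = 0
    for k in range(1, k_max + 1):
        h_k = block_entropy(symbols, k)
        rows.append((k, h_k - previous, len(block_probabilities(symbols, k))))
        previous = h_k
    return rows


# Hypothetical coded sequence (an assumption): a strictly periodic signal,
# whose entropy increments drop to nearly zero from k = 2 onwards.
periodic = [0, 1] * 50
for k, increment, n_blocks in entropy_increments(periodic, 4):
    print(k, round(increment, 4), n_blocks)
```

Returning the number of distinct blocks alongside each increment makes it easy to compare it with the maximum possible number of blocks and to spot where the statistics run out.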

Another important consequence of Eq. (14) is that for $k > r$

$$ \mu([i_1 \cdots i_{k+1}]) = \frac{\mu([i_1 \cdots i_k])\, \mu([i_2 \cdots i_{k+1}])}{\mu([i_2 \cdots i_k])} \tag{17} $$

We will use both the criterion following from (15) and Eq. (17) to test for
the possibility to construct a Gibbs measure for the market fluctuation data.
A five-symbol code $\Sigma$

$$ \Sigma = \{-2, -1, 0, 1, 2\} \tag{18} $$

is used for the one-day return data $r(t)$

$$ r(t) = \log p(t + 1) - \log p(t) \tag{19} $$



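The average $\bar{r}(t)$ and standard deviation $s = \sqrt{\overline{r^2(t)} - \bar{r}(t)^2}$ of the
returns are computed. Then,

$$
\begin{array}{rcl}
(r(t) - \bar{r}(t)) > s & \to & 2 \\
s \geq (r(t) - \bar{r}(t)) > \frac{s}{2} & \to & 1 \\
\frac{s}{2} \geq (r(t) - \bar{r}(t)) > -\frac{s}{2} & \to & 0 \\
-\frac{s}{2} \geq (r(t) - \bar{r}(t)) > -s & \to & -1 \\
-s \geq (r(t) - \bar{r}(t)) & \to & -2
\end{array}
\tag{20}
$$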
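
A coding of this kind is straightforward to sketch in Python. In the sketch below the outer thresholds are placed at plus and minus one standard deviation, as in the text; the position of the inner thresholds is exposed as the parameter `inner` (a fraction of $s$), and its default value of one half is an assumption, as are the function name and the sample data.

```python
import math


def five_symbol_code(r, inner=0.5):
    """Code returns into {-2, -1, 0, 1, 2} using thresholds at +-inner*s and +-s.

    `inner` is the fraction of the standard deviation s used for the inner
    thresholds; the default 0.5 is an assumption of this sketch.
    """
    mean = sum(r) / len(r)
    variance = sum(v * v for v in r) / len(r) - mean ** 2
    s = math.sqrt(max(variance, 0.0))  # guard against rounding below zero

    def code(value):
        d = value - mean
        if d > s:
            return 2
        if d > inner * s:
            return 1
        if d > -inner * s:
            return 0
        if d > -s:
            return -1
        return -2

    return [code(v) for v in r]


# Hypothetical returns (an assumption for illustration only).
sample = [-3.0, -1.5, 0.0, 1.5, 3.0]
print(five_symbol_code(sample))
```

Keeping the inner threshold as a parameter makes it easy to check how sensitive the block statistics are to the exact choice of partition.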



Figure 12: $H_k - H_{k-1}$ and the number of occurring blocks of size $k$ (IBM
and Bayer)

This coding is used and the empirical frequencies $\bar{\mu}([i_1 \cdots i_n])$ for blocks
of successively larger order $n$ are found. Of course $n$ cannot be arbitrarily
large because of statistics. Results will not be reliable whenever $5^n$ is larger
than the size $N$ of the data sample. The statistical reliability may be directly
tested either by comparing the number of different occurring blocks and $5^n$
or by observing the fall-off of the empirically computed $H_n - H_{n-1}$.

Figure 13: Same as Fig. 12 for BMW and NYSE

First we try to estimate a possible range for the potential using the
criterion discussed above. The results are shown in Figs. 12 and 13 for the
analyzed stocks and the NYSE index. The plots on the left show the
quantity $H_k - H_{k-1}$ and the plots on the right compare the number $p(k)$ of
occurring blocks of size $k$ in the data with the maximum possible number,
$5^k$. Already for $k = 2$ the difference $H_k - H_{k-1}$ seems to stabilize, staying
nearly constant until $k = 4$. After $k = 4$ it falls off, reflecting the lack of
statistics also apparent in the comparison of $p(k)$ with $5^k$ in the right hand
side plots. These results seem to suggest that the data is described by a
very short range potential. Notice that for a similar analysis performed on
hydrodynamic turbulence data the results are quite different, with $H_k - H_{k-1}$
rising smoothly up to a certain saturation level and then decreasing when
one reaches the lack of statistics level.

To check whether the short-range potential suggested by this criterion
is reliable or whether it simply results from some misleading feature of the
data, we have performed the test following from Eq. (17). For successively
higher $k$ we estimate

$$ \mu_e([i_1 \cdots i_{k+1}]) = \frac{\bar{\mu}([i_1 \cdots i_k])\, \bar{\mu}([i_2 \cdots i_{k+1}])}{\bar{\mu}([i_2 \cdots i_k])} $$

from the empirical $\bar{\mu}([i_1 \cdots i_k])$ and $\bar{\mu}([i_1 \cdots i_{k-1}])$, which is then compared with the empirically
observed $\bar{\mu}([i_1 \cdots i_{k+1}])$. The standard deviation of the relative positive
errors, computed from the differences

$$ \Delta = \bar{\mu}([i_1 \cdots i_{k+1}]) - \mu_e([i_1 \cdots i_{k+1}]) \tag{21} $$

is computed and the number of blocks for which this error is one and two
standard deviations above the mean is counted. The result is plotted in
Fig. 14 where the number of underestimation errors that are one (o) and two
(*) standard deviations away from the mean error are compared with the
total number $p(k)$ of different observed blocks of each length $k$.
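
The consistency test based on Eq. (17) can be sketched as follows. This is a minimal Python illustration under stated assumptions: block probabilities are relative frequencies of overlapping blocks, the relative error is taken with respect to the observed probability (the normalization is an assumption of this sketch), and all function names are invented here.

```python
import math
from collections import Counter


def block_probabilities(symbols, k):
    """Empirical probabilities of the k-blocks of a symbol sequence."""
    if k == 0:
        return {(): 1.0}
    total = len(symbols) - k + 1
    counts = Counter(tuple(symbols[i:i + k]) for i in range(total))
    return {block: c / total for block, c in counts.items()}


def eq17_estimates(symbols, k):
    """Map each observed (k+1)-block to (observed, estimated) probabilities.

    The estimate uses k-blocks and (k-1)-blocks as in Eq. (17); k must be >= 1.
    """
    mu_k = block_probabilities(symbols, k)
    mu_km1 = block_probabilities(symbols, k - 1)
    observed = block_probabilities(symbols, k + 1)
    result = {}
    for block, p_obs in observed.items():
        p_est = mu_k[block[:-1]] * mu_k[block[1:]] / mu_km1[block[1:-1]]
        result[block] = (p_obs, p_est)
    return result


def count_large_underestimates(estimates, n_std=1.0):
    """Count blocks whose positive relative error exceeds mean + n_std * std.

    Relative error = (observed - estimated) / observed (an assumption).
    """
    errors = [(po - pe) / po for po, pe in estimates.values() if po > pe]
    if not errors:
        return 0
    mean = sum(errors) / len(errors)
    std = math.sqrt(sum((e - mean) ** 2 for e in errors) / len(errors))
    return sum(1 for e in errors if e > mean + n_std * std)


# Hypothetical coded sequence (an assumption for illustration only).
sequence = [0, 1, 1, 0, 2, 0, 1, 1, 0, 2, 2, 0, 1, 0, 0, 1, 1, 2, 0, 1]
estimates = eq17_estimates(sequence, 2)
print(count_large_underestimates(estimates, 1.0),
      count_large_underestimates(estimates, 2.0), len(estimates))
```

Listing the blocks that exceed the threshold, rather than only counting them, is what allows one to see which symbols are responsible for the failures.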

Figure 14: Underestimation errors one (o) and two (*) standard deviations
away from the mean and the total number $p(k)$ of observed blocks

One sees that the number of large deviation errors is very large and,
identifying the blocks for which these errors occur, one finds out that they
all correspond to blocks involving large positive or negative $r(t)$'s (2 and
$-2$). The conclusion is that a short-range potential would describe the small
fluctuations in the data, the large fluctuations being badly described by it.
The reason why the empirically found difference $H_k - H_{k-1}$ seems to saturate
for a small $k$ is that, as is apparent from Fig. 11, the bulk of the data
consists mostly of small fluctuations plus a few large flights. The saturation
of $H_k - H_{k-1}$ for small $k$ is a reflection of the largely uncorrelated nature
of the small fluctuations, whereas other features like the large deviations,
persistence of non-linear correlations (volatility), etc. are not captured by a
short-range potential.

Large deviations being misrepresented by an empirically constructed
measure is typical of situations where the actual measure is non-Gibbsian.
In our case, however, it may also occur that the measure is Gibbsian but with
a long-range potential. This would correspond to a sharp rise of $H_k - H_{k-1}$
at $k = 2$ followed by a very slow increase above $k = 2$. In the empirical
results a small increase may be hidden by the fact that, as the block length
increases, the statistics becomes poorer. A large deviation analysis applied
to the calculation of $H_k$, using a standard technique [25] to construct the
free energy and the deviation function from the data, is consistent with this
hypothesis.

In any case, whether a Gibbs measure exists or not, the finite-range
potential framework does not seem to be the most convenient way to describe
the market fluctuation process. In the next section we will explore another
approach especially suited to deal with long-memory processes.



5 Market fluctuations as a chain with complete connections

Processes with long memory have been studied in the past. Under certain
conditions, that is, when the dependence on the past does not decay too
slowly, existence and uniqueness of a well defined process may be proved. A
particularly well established framework is the one of chains with complete
connections and summable decays.

A stochastic process $\{X_n\}_{n \in \mathbb{Z}}$ with alphabet $\Sigma$ is said to be a chain with
complete connections (CCC) if the following conditions are satisfied

1. $\forall a_i \in \Sigma$

$$ P(X_1 = a_1, \cdots, X_n = a_n) > 0 \tag{22} $$

2. The limit

$$ \lim_{m \to \infty} P(X_0 = a_0 | X_j = a_j, -m \leq j \leq -1) = P(X_0 = a_0 | X_j = a_j, j \leq -1) \tag{23} $$

exists $\forall a_j$, $j \leq -1$

3. There is a sequence $(\gamma_m)_{m \geq 1}$ with $\lim_{m \to \infty} \gamma_m = 0$, such that for all
$\{a_j, b_j \in \Sigma, j \leq -1\}$ with $a_j = b_j$ for $-m \leq j \leq -1$

$$ \left| \frac{P(X_0 = a_0 | X_j = a_j, j \leq -1)}{P(X_0 = a_0 | X_j = b_j, j \leq -1)} - 1 \right| \leq \gamma_m \tag{24} $$

The process is said to be a chain with complete connections and summable
decay (CCCSD) if $\sum_m \gamma_m < \infty$.
Conditions 1. and 2. are implicitly assumed when we considered the
processes (and pre-processed the data) to be asymptotically stationary. As
for the decays $\gamma_m$, they may be estimated from a typical sample of the process.
From the empirical probabilities for

$$ P(a_0 | a_1 \cdots a_m A) = \frac{P(a_0 a_1 \cdots a_m A)}{P(a_1 \cdots a_m A)} \tag{25} $$

where $A$ is a block of arbitrary length, one computes for each fixed set
$a_0 a_1 \cdots a_m$ the maximum and the minimum over $A$,

$$ g(a_0 a_1 \cdots a_m) = \left( \frac{\max_A P(a_0 | a_1 \cdots a_m A)}{\min_A P(a_0 | a_1 \cdots a_m A)} - 1 \right) \tag{26} $$

obtaining for $\gamma_m$

$$ \gamma_m = \max_{a_0 a_1 \cdots a_m} g(a_0 a_1 \cdots a_m) \tag{27} $$

However if the statistics for very long blocks is poor, which is in general
the case for finite samples, the computation of the maximum from empirical
data is not reliable. A better estimate of the behavior of the decay
rates is obtained from the following quantity, which smooths out the large
fluctuations due to poor statistics

$$ g(m) = \langle g(a_0 a_1 \cdots a_m) \rangle \tag{28} $$

the average being taken over all sets $a_0 a_1 \cdots a_m$ of size $m$.
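
The estimation of the decay rates from a coded sample can be sketched in Python as below. This is a hedged illustration, not the authors' procedure: the block $A$ is given a fixed length `tail_length`, only blocks that actually occur in the sample contribute to the maximum and minimum (so zero empirical probabilities are never divided by), and the function name is invented here.

```python
import random
from collections import Counter, defaultdict


def decay_estimates(symbols, m, tail_length):
    """Return (gamma_m, g(m)) estimated with A-blocks of length tail_length.

    gamma_m is the maximum and g(m) the average of
    max_A P(a0 | a1..am A) / min_A P(a0 | a1..am A) - 1, as in Eqs. (26)-(28).
    Requires len(symbols) > m + tail_length.
    """
    span = m + tail_length
    context_counts = Counter()
    joint_counts = Counter()
    for t in range(span, len(symbols)):
        # Time-ordered past: the older block A first, then the m recent symbols.
        context = tuple(symbols[t - span:t])
        context_counts[context] += 1
        joint_counts[(context, symbols[t])] += 1

    # Collect P(a0 | recent, A) over the observed A for each (a0, recent).
    groups = defaultdict(list)
    for (context, a0), count in joint_counts.items():
        recent = context[tail_length:]
        groups[(a0, recent)].append(count / context_counts[context])

    g_values = [max(ps) / min(ps) - 1.0 for ps in groups.values()]
    return max(g_values), sum(g_values) / len(g_values)


# Hypothetical coded sample (an assumption): independent symbols, for which
# any spread over A is purely a finite-sample effect.
rng = random.Random(2)
sample = [rng.choice([-1, 0, 1]) for _ in range(3000)]
for m in range(1, 4):
    print(m, decay_estimates(sample, m, tail_length=2))
```

Running the sketch on independent symbols gives a feeling for the size of the spurious decay rates produced by poor statistics alone, which is the reason the averaged quantity is preferred in the text.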

The results obtained for the $\Sigma$-coded data of the detrended fluctuations
(of BMW data) using $A$ blocks of length 5 to 8 (x, +, o, *) are plotted in Fig. 15.
Similar results are obtained for the other data. The result is compatible with
exponential decay, which would probably imply the existence of a Gibbs
measure (albeit with a long range potential). The data for the maxima of
$g(a_0 a_1 \cdots a_m)$ displays large fluctuations and slower decay. However, with
the amount of available data it is not reliable for long blocks. In any case,
in the present context of CCC-processes, what the result suggests is the
summability of the $\gamma_m$'s ($\sum_m \gamma_m < \infty$). For practical purposes the most
important consequence of this fact is that a CCC-process with summable
decays is the $\bar{d}$-limit of its Markov approximations of order $k$. The nature
of this approximation should however be clearly understood.

Figure 15: $g(m)$ computed using $A$ blocks of length 5 to 8 (x, +, o, *)



The $\bar{d}$-distance between two processes refers not to the processes
themselves but to the process that implements the coupling of the two
processes. A coupling between two processes $X = \{X_n\}$ and $Y = \{Y_n\}$ over the
alphabet $\Sigma$ is another process $\{(\tilde{X}_n, \tilde{Y}_n)\}$ defined over $\Sigma \times \Sigma$ such that the
marginal probabilities of $\tilde{X}$ and $\tilde{Y}$ coincide with those of $X$ and $Y$. Then
the $\bar{d}$-distance between $X$ and $Y$ is

$$ \bar{d}(X, Y) = \inf \left\{ P(\tilde{X}_0 \neq \tilde{Y}_0) : \{(\tilde{X}_n, \tilde{Y}_n)\} \text{ is a stationary coupling of } X \text{ and } Y \right\} \tag{29} $$

For some types of coupling the two processes $\tilde{X}$ and $\tilde{Y}$ are known to coincide
after a certain random time. However, for the original processes $X$ and $Y$,
if the $\bar{d}$-distance tends to zero it does not mean that the processes will coincide
after a certain time. It only means that this will occur for some other processes
with the same marginal probabilities.

This fact has an important bearing on the correct interpretation of the
"perfect simulation" schemes proposed for CCC's. Perfect simulation is
always understood in the $\bar{d}$-distance sense and it does not mean perfect
prediction. It means simply that a process is constructed with the same conditional
probabilities as the original process, whenever the conditional probabilities
of the original process are known. In practice not all conditional
probabilities involving infinite pasts are needed because, going back to a regeneration
time, only a finite number of back steps are required. Several simulation
schemes have been proposed for CCC's with summable decays. The most
important one for the applications, when the conditional probabilities are
inferred from experiment, is the sequence of canonical Markov approximations
of finite order $k$ ($k$-CMA). A $k$-CMA of a process $X$ is a Markov chain
$Y^{(k)}$ of order $k$ with conditional probabilities $P^{(k)}$ such that

$$ P^{(k)}(a_0 | a_1 \cdots a_k) = P(a_0 | a_1 \cdots a_k) = \sum_A P(a_0 | a_1 \cdots a_k A)\, P(A | a_1 \cdots a_k) \tag{30} $$

For a CCC $X$ with summable decays

$$ \bar{d}(X, Y^{(k)}) \leq C \gamma_k \tag{31} $$

$C > 0$ being a constant. Actually the property of the Markov approximation
that is essential for the approximation result (31) is

$$ \inf_A P(a_0 | a_1 \cdots a_k A) \leq P^{(k)}(a_0 | a_1 \cdots a_k) \leq \sup_A P(a_0 | a_1 \cdots a_k A) \tag{32} $$

meaning that for Markov approximation schemes other than the canonical
one, Eq. (31) holds provided (32) is satisfied. In fact, when the conditional
probabilities are inferred from limited experimental data a different Markov
approximation is more convenient.

The following approximation scheme is proposed for the market fluctua- 
tion data, which we call the < fc— Markov approximation: 

i) Empirical transition probabilities P (aol^^i ■ ■ ■ o-m) are inferred from the 
occurrence probability of blocks of order m + 1. up to a certain order niMax- 
Of course, only probabilities that correspond to blocks ai ■ ■ ■ that appear 
in the data will be available and especially for large m many will be missing. 

ii) For the simulation, with an approximation of order $k$, one looks at the
current block $(a_1 \cdots a_k)$ of order $k$ and uses the $k$-empirical probability to
infer the next state $a_0$. If that block has not appeared in the data that was
used to construct the empirical probabilities, then one looks at the $k - 1$ sized
block $a_2 \cdots a_k$ and uses the $k - 1$ order empirical probabilities. If necessary
the process is repeated until an available empirical probability is found. This
is the reason why this is called the $\leq k$-Markov approximation.
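
Steps i) and ii) can be illustrated with a minimal Python sketch. It is not the code used for the results reported here: the function names (`build_tables`, `lookup`, `simulate`) are hypothetical, the symbol sequence is stored in time order, and the sketch assumes that shortening a block means dropping its oldest symbol, which is one possible reading of the block ordering above.

```python
import random
from collections import Counter, defaultdict


def build_tables(seq, m_max):
    """Step i): count, for every order m <= m_max, how often each symbol
    follows each context of length m (contexts are tuples in time order)."""
    tables = [defaultdict(Counter) for _ in range(m_max + 1)]
    for t in range(len(seq)):
        for m in range(min(m_max, t) + 1):
            context = tuple(seq[t - m:t])
            tables[m][context][seq[t]] += 1
    return tables


def lookup(tables, history, k):
    """Step ii): return (order_used, counts) for the longest context of
    length <= k that was observed in the data (assumption: the oldest
    symbol is dropped each time the context is shortened)."""
    start = min(k, len(history), len(tables) - 1)
    for m in range(start, -1, -1):
        context = tuple(history[len(history) - m:])
        counts = tables[m].get(context)
        if counts:
            return m, counts
    raise ValueError("no empirical probabilities available")


def sample_next(tables, history, k, rng):
    """Draw the next symbol from the empirical conditional probabilities."""
    _, counts = lookup(tables, history, k)
    symbols = list(counts)
    weights = [counts[s] for s in symbols]
    return rng.choices(symbols, weights=weights)[0]


def simulate(tables, seed_block, k, n_steps, rng):
    """Generate n_steps further symbols starting from seed_block."""
    path = list(seed_block)
    for _ in range(n_steps):
        path.append(sample_next(tables, path, k, rng))
    return path
```

A call such as `simulate(build_tables(data, 5), data[:5], 5, 1000, random.Random(0))` would then produce a simulated path of order at most 5, where `data` stands for any coded symbol sequence.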

This approximation scheme has been applied to the market fluctuation
data, and for each order $k$ the successor of each block $(a_1 \cdots a_k)$ is
compared with a prediction $\hat{a}_0$ obtained by throwing a random number with the
probabilities $P(a_0 \mid a_1 \cdots a_k)$. Fig. 16 shows some of the results. In all cases
the quantity that is plotted is the averaged squared error

$$\left\langle (a_0 - \hat{a}_0)^2 \right\rangle \tag{33}$$

the average being taken over the samples and 100 different runs. The two
upper plots and the left lower plot show the results obtained (for each
approximation order $k$) when half of the data for each company is used to
predict the other half. The points labelled (o) correspond to the past used
to predict the future and those labelled (*) to the future used to predict
the past. Finally the right lower plot shows the results obtained when $\hat{a}_0$ is
chosen at random (for the 3 companies, IBM (o), Bayer (*) and BMW (+)).
The main conclusions that may be extracted from these results are: 

• The average prediction obtained from using the empirical probabilities 
is better than a random choice. 

• However, the main improvement is a result of a correct accounting of
the two-symbol probabilities ($k = 1$).

• After the improvement due to the use of the lowest order blocks a small
(but consistent) improvement is found by using the past information
up to $k = 4$ or 5. No significant improvement is obtained by using higher
order approximations. This is consistent with the poorer statistics of
large blocks. Actually for each individual simulation the result of using
$k > 5$ leads to much larger fluctuations.
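
The evaluation behind these conclusions (half of the data predicting the other half, error of Eq. (33), comparison with random choice) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the symbols are assumed to be numeric so that a squared difference makes sense, and the back-off to shorter blocks drops the oldest symbol first.

```python
import random
from collections import Counter, defaultdict


def fit(seq, m_max):
    """Empirical next-symbol counts for every context length m <= m_max."""
    tables = [defaultdict(Counter) for _ in range(m_max + 1)]
    for t in range(len(seq)):
        for m in range(min(m_max, t) + 1):
            tables[m][tuple(seq[t - m:t])][seq[t]] += 1
    return tables


def predict(tables, history, k, rng):
    """Draw a prediction using the longest observed context of length <= k."""
    for m in range(min(k, len(history)), -1, -1):
        counts = tables[m].get(tuple(history[len(history) - m:]))
        if counts:
            symbols = list(counts)
            weights = [counts[s] for s in symbols]
            return rng.choices(symbols, weights=weights)[0]
    raise ValueError("no empirical probabilities available")


def mean_squared_error(train, test, k, n_runs, rng):
    """Average of (a_0 - prediction)^2 over all test positions and n_runs runs."""
    tables = fit(train, k)
    total, n = 0.0, 0
    for _ in range(n_runs):
        for t in range(k, len(test)):
            guess = predict(tables, test[t - k:t], k, rng)
            total += (test[t] - guess) ** 2
            n += 1
    if n == 0:
        raise ValueError("test sequence is too short for this order k")
    return total / n


def random_choice_error(test, alphabet, n_runs, rng):
    """Same error when the prediction is drawn uniformly from the alphabet."""
    total, n = 0.0, 0
    for _ in range(n_runs):
        for a0 in test:
            total += (a0 - rng.choice(alphabet)) ** 2
            n += 1
    return total / n
```

With a sequence split into `first_half` and `second_half`, `mean_squared_error(first_half, second_half, k, 100, rng)` corresponds to the past predicting the future, and swapping the two arguments to the future predicting the past.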

The main conclusion is that although the bulk of the data is represented
by a short-memory process, there is nevertheless evidence for a small
long-memory component that is captured by the higher-order Markov
approximations. Depending on the amount of data that is available to infer the
empirical conditional probabilities there is a maximum $k = k_m$ that should
be used for the simulation process. This $k_m$ value may be estimated from
the quantity $p(k)$ plotted in Figs. 12 and 13. Finally, although a mild gain is
obtained from using $k_m$-block probabilities rather than one-symbol
probabilities, it should be remembered that perfect simulation in the $\bar{d}$-distance
sense is not perfect prediction for the actual process. This is a point to keep
in mind when attempting to develop any trading strategies based on the
empirical block probabilities.

Figure 16: The past predicting the future (o) and the future predicting the
past (*), compared to random choice

We have also explored the use of the empirical probabilities of one
company to predict the behavior of the others. In all cases the improvement
coming from the one-symbol probabilities (as compared to random choice)
is obtained. This means that the one-symbol probabilities are similar in all
companies. However, for the long-memory component the behavior is very
much company-dependent. For example, there seems to be no correlation of
this component between IBM and the other two companies, with the
prediction being actually worse when the empirical probabilities for longer blocks
are used. The same happens when the empirical probabilities of BMW
and Bayer are used to predict IBM. However, there is some statistical
correlation between the long-memory components of BMW and Bayer (and some
mild prediction improvement). This suggests that the statistical
short-memory component of the market process might be similar for many
different stocks, whereas the long-memory component might differ from
market to market and might serve to divide the stocks into classes. A similar
conclusion follows from the stocks taxonomy obtained by Mantegna [30], although that
work does not distinguish between the short- and long-memory components
of the process.



6 Conclusions 

1. The bulk of the market fluctuation process seems to be a short-memory 
process. In addition it has a small long-memory component, which 
however is very important for practical purposes because it is associated 
with the large fluctuations of the returns. 

2. Existence of the long-memory component suggests the chains with
complete connections and summable decays as the appropriate framework
to describe these processes. Although the decays may be exponentially
converging, the lack of accurate data concerning long blocks prevents
an accurate description by a finite range Gibbs potential.

3. The sequence of empirically based $\leq k$-Markov approximations
discussed in Sect. 5 seems the most unbiased simulation of the process.
Eventual convergence in the $\bar{d}$-distance sense is expected to hold
because the market fluctuation process seems to fit in the framework of
chains with complete connections and summable decays.

4. Except for cases where one is sure of the existence of a finite
potential, Markov approximations must always be used if only finite data
is available. This is true whether a Gibbs measure exists or not. What
the chains with complete connections framework provides, though, is
a rationale for the convergence of the Markov approximations and a
criterion to estimate, through the $\gamma_k$ decays, how good this
approximation is. Notice however the trade-off between higher order
approximations and lack of statistics, which leads to an optimal block length
for the empirical probabilities to be used in the simulations.

5. As work for the future we point out that it would be interesting to
analyze high-frequency market data in this framework. Here,
however, attention should be paid to the possibly multi-scale and
multi-component nature of the processes.